Detecting and Revamping of X-Outliers in Time Series Database

Authors

  • S. Sridevi
  • S. Abirami
  • S. Rajaram
  • Ning Zhong
  • Muneaki Ohshima
  • J. Chen
  • W. Li
  • A. Lau
  • J. Cao
Abstract

Outliers in a dataset reduce the accuracy of subsequent data mining tasks. To improve mining performance, it is necessary to detect and revamp the outliers present in the dataset. Existing techniques such as ARMA (Auto-Regressive Moving Average), ARIMA (Auto-Regressive Integrated Moving Average) and the multivariate linear Gaussian state space model do not consider periodicity when detecting outliers. These methods find only Y-Outliers, that is, abnormal values along the Y axis; they cannot detect the time at which peculiar data occur (the so-called X-Outliers). This paper focuses on methods for detecting and revamping X-Outliers, which are data that are abnormal with respect to a known periodicity. Such methods are applied in practice in fraud detection, market-basket analysis, and medical applications that detect certain abnormal diseases. First, the data are modeled by kernel smoothing to obtain the trend and to remove noise. Next, outliers are detected by similarity measurements. Detected outliers can then be replaced using the periodic indices of the historical dataset. System performance is measured by precision, recall and F-score. The proposed method is tested on three time series datasets, namely an electricity power consumption dataset, a weather dataset and
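
The pipeline outlined in the abstract (kernel smoothing, similarity-based detection, replacement by periodic index, and evaluation by precision, recall and F-score) can be illustrated with a minimal sketch. The abstract does not name the kernel, the similarity measure or any threshold, so everything specific here is an assumption made for illustration: a Gaussian Nadaraya-Watson smoother, an absolute distance to a per-phase median profile, a MAD-based cutoff `k`, and the hypothetical function names `kernel_smooth`, `detect_x_outliers`, `revamp` and `precision_recall_f`. It is not the authors' implementation.

```python
import numpy as np

def kernel_smooth(y, bandwidth=1.5):
    """Nadaraya-Watson smoother with a Gaussian kernel (assumed kernel choice)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    # Kernel weight between every pair of time indices.
    w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / bandwidth) ** 2)
    return (w @ y) / w.sum(axis=1)

def detect_x_outliers(y, period, bandwidth=1.5, k=4.0):
    """Return time indices whose value is dissimilar to the periodic profile.

    Assumption: similarity is the absolute distance between an observation
    and the median smoothed value at the same periodic index; the cutoff is
    k times a MAD-based scale of those distances.
    """
    y = np.asarray(y, dtype=float)
    n = len(y) // period * period  # use whole cycles only
    smooth = kernel_smooth(y[:n], bandwidth)
    profile = np.median(smooth.reshape(-1, period), axis=0)
    resid = y[:n] - np.tile(profile, n // period)
    scale = 1.4826 * np.median(np.abs(resid - np.median(resid))) + 1e-12
    return np.flatnonzero(np.abs(resid) > k * scale)

def revamp(y, period, outlier_idx):
    """Replace each flagged point by the mean of unflagged points that
    share its periodic index in the rest of the (historical) series."""
    y = np.asarray(y, dtype=float)
    fixed = y.copy()
    flagged = np.zeros(len(y), dtype=bool)
    flagged[outlier_idx] = True
    for i in outlier_idx:
        same_phase = np.arange(i % period, len(y), period)
        clean = same_phase[~flagged[same_phase]]
        if clean.size:
            fixed[i] = y[clean].mean()
    return fixed

def precision_recall_f(predicted, actual):
    """Precision, recall and F-score of predicted versus actual outlier indices."""
    predicted = {int(i) for i in predicted}
    actual = {int(i) for i in actual}
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score

# Synthetic example (not one of the paper's datasets): a daily-like cycle
# of 24 samples with small noise and three injected anomalies.
rng = np.random.default_rng(0)
period, cycles = 24, 10
t = np.arange(period * cycles)
series = 10 * np.sin(2 * np.pi * t / period) + rng.normal(0, 0.1, t.size)
true_idx = [30, 101, 200]
series[true_idx] += 8.0

found = detect_x_outliers(series, period)
repaired = revamp(series, period, found)
print(found, precision_recall_f(found, true_idx))
```

Comparing the raw observations against a profile built from the smoothed series, rather than thresholding the smoothed series itself, keeps a single spike from being spread onto its neighbours by the kernel. Other similarity measures or cutoffs would fit the same structure.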


Similar resources

Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data are often modeled using a vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased parameter estimates and inaccurate predictions. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...


New optimized model identification in time series model and its difficulties

Model identification is an important and complicated step within the autoregressive integrated moving average (ARIMA) methodology framework. This step is especially difficult for integrated series. This article first investigates the Box-Jenkins methodology and its faults in model detection, and then discusses the problem of outliers in time series. By using this optimization method, we wil...


Detecting Outliers in Exponentiated Pareto Distribution

In this paper, we use two statistics for detecting outliers in the exponentiated Pareto distribution. These statistics extend the statistics for detecting outliers in the exponential and gamma distributions. We compare the power of our test statistics by means of a simulation study and identify the better test statistic for detecting outliers in the exponentiated Pareto distribution. At t...


A Bayesian Approach for Detecting Outliers in ARMA Time Series

The presence of outliers in a time series can seriously affect model specification and parameter estimation. To avoid these adverse effects, it is essential to detect these outliers and remove them from the time series. Based on Bayesian statistical theory, this article proposes a method for simultaneously detecting the additive outlier (AO) and innovative outlier (IO) in an autoregressive moving-a...


Outlier Detection in Multivariate Time Series via Projection Pursuit

This article uses Projection Pursuit methods to develop a procedure for detecting outliers in a multivariate time series. We show that testing for outliers in some projection directions could be more powerful than testing the multivariate series directly. The optimal directions for detecting outliers are found by numerical optimization of the kurtosis coefficient of the projected series. We pro...



Publication date: 1977